SEQUENCE LISTING

APPLICANT: Aaron Kaplan et al.

TITLE OF INVENTION: ENHANCING INORGANIC CARBON FIXATION BY PHOTOSYNTHETIC ORGANISMS

NUMBER OF SEQUENCES: 9

CORRESPONDENCE ADDRESS:

(A) ADDRESSEE: Mark M. Friedman c/o Anthony Castorina

(B) STREET: 2001 Jefferson Davis Highway, Suite 207

(C) CITY: Arlington

(D) STATE: Virginia

(E) COUNTRY: United States of America

(F) ZIP: 22202

COMPUTER READABLE FORM:

(A) MEDIUM TYPE: 1.44 megabyte, 3.5" microdisk

(B) COMPUTER: Twinhead Slimnote-890TX

(C) OPERATING SYSTEM: MS DOS version 6.2, Windows version 3.11

(D) SOFTWARE: Word for Windows version 2.0 converted to an ASCII file

CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER:

(B) FILING DATE:

(C) CLASSIFICATION:

PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER:

(B) FILING DATE:

ATTORNEY/AGENT INFORMATION:

(A) NAME: Friedman, Mark M.

(B) REGISTRATION NUMBER: 33,883

(C) REFERENCE/DOCKET NUMBER: 325/45

TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: 972-3-5625553

(B) TELEFAX: 972-3-5625554

(C) TELEX:

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4957

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

AAGCTTGGAT TGAAGCGATC GGGGTCAATC CCAGCGATGA TCCTCAGTTC 50
CTCCTGATGG TCGATCCCTT TAGCGCCAAG ATTGAGGATC TGCTGCAAGG 100
GCTGGATTTC GCCTATCCCG AGGCCGTGAA AGTGGGCGGA TTGGCCAGTG 150
GTTTGGGGGC AGAGTCAGCG ATCGCCAGCT TGTTTrrTCA AGACCGACAG 200
GTCGATGGCG TGATTGGGCT AGCCCTCAGT GGCAATGTCC AGCTGCAGGC 250
GATCGTGGCT CAGGGCTGTC GTCCAGTTGG CCCGCTTTGG CATGTGGCAG 300
CGGCGGAGCG CAACATTCTG CGGCAACTTC AGACCGAAGA CGAGGAACCG 350
ATCGCCGCGC TGCAAGCCCT ACAGTCAGTC CTGCGTGATC TCTCCCCTGA 400
ATTACAGCGA TCGCTCTGTG TGGGCCTGGC CTGCAATTCT TTCCAAACGG 450
TATTACAACC GGGCGACTTC CTGATCCGTA ACCTGCTGGG GTTTGATCCC 500
CGCACTGGTG CTGTAGCAAT CGGCGATCGC ATTCGAGTTG GGCAGCGGCT 550
GCAGCTGCAC GTACGGGATG CCCAGACAGC GGCGGATGAC CTCGAGCGGC 600
AACTGGGGCA ATGGTGCCGG CAGCATGCGA CAAAACCAGC AGCTTCCCTC 650
TTGTTTTCCT GCTTGGGGCG CGGCAAGCCC TTCTATCAGC AGGCCAACTT 700
CGAGTCGCAA CTGATTCAGC ATTACCTCTC AGAGCTGCCC CTAGCTGGCT 750
TTTTCTGTAA TGGCGAAATC GGCCCGATCG CTGGCAGCAC CTACCTGCAT 800
GGCTACACAT CGGTGCTGGC TTTGCTGTCG GCCAAAACTC ACTAGCGCCA 850
GCGAGACCTG ATTGTCGATC TGCTGAGCGC GACTGTAGCG CTGGAAATAG 900
GCCCGGACCT GAGCAGGCGC ATCGGCCAAG CTGACCGTAG TATCACCGTC 950
AGCCACCCCC GCCCAGAAAT TCCGCAACAT CGGCAGGAGA GCGATCGCCT 1000
CCGCCTCCGA TAAATTCAAC GGCTCATGGG TCAACAGGCG GATCAAGTAC 1050
TCTGACTGCG ATCGCCATCC ATTCCCGCCG AAAACGTTTG TAAATCAGTC 1100
TTGATCCGGT AGCGATCGCA CCCGACGGGA CTCTAGTTCT AGTTGCCAAC 1150
CTTCAGCGGC AGGTTGTACG GTTCCGAGTC GGTAGGGATG GGGATAGCTG 1200
ACCAAGGAAC CGGTCGTGAC TTCCCAGAGA GCACCTTGCT GACTGGTGGC 1250
TTGGATGTGG AGGTGGCCTG TGAAGATCAC CGAGACGCTG CCCGCTTCGA 1300
GGATTGATCG CAATTCCTCG GCATTTTCTA AGATGTAGCG CTGACCAAGC 1350
GGATGCTGCT GTTGATCGGG CAGATGCTCC AACACATTGT GGTGAATCAT 1400
CACCCAGCGT TGGCTAGCGG TGGAAGTGGC GAGTTCTTGT TGCAGCCAGT 1450
TGAGTTGCGC GCAATCGACT CGCCCCCGAT GCAGTTGATG GCCCGCTTCA 1500
TCAAAAGCGA TCGAATTCAG CGCAAACAGA TCGAGATCCG GTGCGATCGT 1550
GCAGCGATAG TAGGGGCGAT CGCTCGTGAA GCCAAAGTCT TGATAGAGCT 1600
CGACAAACTC GGCCACACCG GTGCGATCGC GATCGCTCGC TGCGGCGGGC 1650
ATATCGTGGT TGCCCGGCAC CACATAGACC GGATAGGGCA ACTGGCGCAA 1700
TTGTTGCAGC AGCCACTGAT GGTTTTCCCG CTCCCCGTGC TGGGTTAAAT 1750
CCCCCGGCAG CAACAGGAAG TCCAAATCCA GCGCTGCCAG TTCTGTCAGG 1800
ATTTGCTCAA AAGCCGGAAT GCTGCACTCA ATCAAATGGA AGCGATGGGG 1850
ATGGTGCCAA ATTGTCTGCG GCAGTCCAAT GTGGAGATCG CTCAGCAGCG 1900
CAAATCGAAA CGCTCGGTTC ATTGCCATCC CCTCAGCTAT CGAGCCCGAT 1950
TCTAGGCGAA GCTAGGTCGA GTCCGTTGTC TTCAGTTGCA AGCATTCATG 2000
GCCAGAGTTC GCGTTCGGCA GCACGTCAAT CCGCTCTCTC AGAAATTCCA 2050
AGTGGTCACG ACTTGGCCGG ATTGGCAACA GGTCTATGCG GACTGCGATC 2100
GCCCGCTGCA TTTGGATATT GGCTGTGCTC GCGGGCGCTT TCTGCTGGCA 2150
ATGGCGACAC GACAACCTGA GTGGAATTAT CTGGGGCTGG AAATTCGTGA 2200
GCCGCTGGTA GATGAGGCGA ACGCGATCGC CCGCGAACGT GAACTGACCA 2250
ATCTCTACTA CCACTTCAGC AACGCCAATT TGGACTTGGA ACCGCTGCTG 2300
CGATCGCTGC CGACAGGGAT TTTGCAGCGG GTCAGCATTC AGTTCCCGGA 2350
TCCTTGGTTC AAGAAACGCC ATCAAAAGCG ACGCGTCGTC CAGCCGGAAC 2400
TGGTGCAAGC CCTCGCGACT GCGTTACCTG CTGGTGCAGA GGTCTTTCTG 2450
CAATCCGATG TGCTGGAAGT GCAGGCAGAG ATGTGCGAAC ACTTTGCGGC 2500
GGAACCCCGC TTTCAGCGCA CCTGCTTGGA CTGGCTGCCG GAAAATCCGC 2550
TGCCCGTCCC GACCGAGCGC GAAATTGCCG TTCAAAACAA ACAGTTGCCA 2600
GTCTACCGTG CTCTCTTCAT TCGGCAGCCA GCGGACTAAG CTCTTAAGGC 2650
AAGCGTTGAC GCGATCGCGA TGACTGTCTG GCAAACTCTG ACTTTTGCCC 2700
ATTACCAACC CCAACAGTGG GGCCACAGCA GTTTCTTGCA TCGGCTGTTT 2750
GGCAGCCTGC GAGCTTGGCG GGCCTCCAGC CAGCTGTTGG TTTGGTCTGA 2800
GGCACTGGGT GGCTTCTTGC TTGCTGTCGT CTACGGTTCG GCTCCGTTTG 2850
TGCCCAGTTC CGCCCTAGGG TTGGGGCTAG CCGCGATCGC GGCCTATTGG 2900
GCCCTGCTCT CGCTGACAGA TATCGATCTG CGGCAAGCAA CCCCCATTCA 2950
CTGGCTGGTG CTGCTCTACT GGGGCGTCGA TGCCCTAGCA ACGGGACTCT 3000
CACCCGTACG CGCTGCAGCT TTAGTTGGGC TAGCCAAACT GACGCTCTAC 3050
CTGTTGGTTT TTGCCCTAGC GGCTCGGGTT CTCCGCAATC CCCGTCTGCG 3100
ATCGCTGCTG TTCTCGGTCG TCGTGATCAC ATCGCTTTTT GTCAGTGTCT 3150
ACGGCCTCAA CCAATGGATC TACGGCGTTG AAGAGCTGGC GACTTGGGTG 3200
GATCGCAACT CGGTTGCCGA CTTCACCTCA CGGGTTTACA GCTATCTGGG 3250
CAACCCCAAC CTGCTGGCTG CTTATCTGGT GCCGACGACT GCCTTTTCTG 3300
CAGCAGCGAT CGGGGTGTGG CGCGGCTGGC TCCCCAAGCT GCTGGCGATC 3350
GCTGCGACAG GTGCGAGCAG CTTATGTCTG ATCCTCACCT ACAGTCGCGG 3400
TGGCTGGCTG GGTTTTGTCG CCATGATTTT TGTCTGGGCG TTATTAGGGC 3450
TCTACTGGTT TCAACCCCGT CTACCCGCAC CCTGGCGACG CTGGCTATTC 3500
CCAGTCGTAT TGGGTGGACT AGTCGCGGTG CTCTTGGTGG CGGTGCTTGG 3550
ACTTGAGCCG TTGCGCGTGC GCGTGTTGAG CATCTTTGTG GGGCGTGAAG 3600
ACAGCAGCAA CAACTTCCGG ATCAATGTCT GGCTGGCGGT GCTGCAGATG 3650
ATTCAAGATC GGCCTTGGCT GGGCATCGGC CCCGGCAATA CCGCCTTTAA 3700
CCTGGTTTAT CCCCTCTATC AACAGGCGCG CTTTACGGCG TTGAGCGCCT 3750
ACTCCGTCCC GCTGGAAGTC GCGGTTGAGG GCGGACTACT GGGCTTGACG 3800
GCCTTCGCTT GGCTGCTGCT GGTCACGGCG GTGACGGCGG TGCGGCAGGT 3850
GAGCCGACTG CGGCGCGATC GCAATCCCCA AGCCTTTTGG TTGATGGCTA 3900
GCTTGGCCGG TTTGGCAGGA ATGCTGGGTC ACGGTCTGTT TGATACCGTG 3950
CTCTATCGAC CGGAAGCCAG TACGCTCTGG TGGCTCTGTA TTGGAGCGAT 4000
CGCGAGTTTC TGGCAGCCCC AACCTTCCAA GCAACTCCCT CCAGAAGCCG 4050
AGCATTCAGA CGAAAAAATG TAGCGGGCTC CCCAACAAAT TCCTGTGCAC 4100
CCGACTGGAT CCACCACCTA AACTGGATCC CAAAGGTATC CGGTGGATCT 4150
AGGGTCATAA CGAACTCCGA CCGCGATCGC GTCCGCGAAC TGAACCTCCA 4200
TCGCACCGAA GCGGAGTTCG TTAGTCGTTG AAGAGCCAAT GCTAGAGGGG 4250
GCTGCCGAAG CAGTTGGGCT GGAAGCAGGC TGCGAGAAGC CACCCGCATC 4300
CAAGGCAAAG TTCAGCCGAC CTTCCGCAAA GACTACGATC GCCACGGCGG 4350
CTCTGCCAGC TAAGTCAGCG CTGGGTTAGT TGTCATAGCA GTCCGCAGAC 4400
AAGTTAGGAC AACTTCATAG AGGGACTCGC TCAGAGTCAA CAGCCGCTGT 4450
CCGTGGGGGT GGGCAATCAC CCCCACACCC ACGCACTGGG GGACTCGACT 4500
CCCCCAGGCC CCCCGCAACA AGATTTCGGA TAAGGGGCAT CGGCTGAATC 4550
GCGATCGCTG CGGGTAAAAC TAGCCGGTGT TAGCCATGGG TTTGAGACTA 4600
ATCGGCACGG GGCAAAACGT CCTGATTTAT TTGCTCAATG TGATAGGTTA 4650
CATCGTCAAA AACAAGGCCC AAGAGGTAGG AAAAATCACG ACCGCCCAAG 4700
TCCGAGGGCT TTGCTGTTGG GAGCGACCTA GGGCAGACTA GACAGAGCAT 4750
TGCTGTGAGC CAAAGCGCCT TCAATTGCTG GCGGCTGTGG GTTTTTCGGA 4800
GGTTGCCAAA TGAAAGACCT TTTCGTCAAT GTCCTCCGCT ATCCCCGCTA 4850
CTTCATCACC TTCCAGCTGG GTATTTTTTA GTCGATCTAC CAGTGGGTGC 4900
GGCCGATGGT TCGCAACCCA GTCGCGGCTT GGGCGCTGCT AGGCTTTGGA 4950
GTTTCGA 4957



(2) INFORMATION FOR SEQ ID NO:2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1404 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

ATGACTGTCT GGCAAACTCT GACTTTTGCC CATTACCAAC CCCAACAGTG 50
GGGCCACAGC AGTTTCTTGC ATCGGCTGTT TGGCAGCCTG CGAGCTTGGC 100
GGGCCTCCAG CCAGCTGTTG GTTTGGTCTG AGGCACTGGG TGGCTTCTTG 150
CTTGCTGTCG TCTACGGTTC GGCTCCGTTT GTGCCCAGTT CCGCCCTAGG 200
GTTGGGGCTA GCCGCGATCG CGGCCTATTG GGCCCTGCTC TCGCTGACAG 250
ATATCGATCT GCGGCAAGCA ACCCCCATTC ACTGGCTGGT GCTGCTCTAC 300
TGGGGCGTCG ATGCCCTAGC AACGGGACTC TCACCCGTAC GCGCTGCAGC 350
TTTAGTTGGG CTAGCCAAAC TGACGCTCTA CCTGTTGGTT TTTGCCCTAG 400
CGGCTCGGGT TCTCCGCAAT CCCCGTCTGC GATCGCTGCT GTTCTCGGTC 450
GTCGTGATCA CATCGCTTTT TGTCAGTGTC TACGGCCTCA ACCAATGGAT 500
CTACGGCGTT GAAGAGCTGG CGACTTGGGT GGATCGCAAC TCGGTTGCCG 550
ACTTCACCTC ACGGGTTTAC AGCTATCTGG GCAACCCCAA CCTGCTGGCT 600
GCTTATCTGG TGCCGACGAC TGCCTTTTCT GCAGCAGCGA TCGGGGTGTG 650
GCGCGGCTGG CTCCCCAAGC TGCTGGCGAT CGCTGCGACA GGTGCGAGCA 700
GCTTATGTCT GATCCTCACC TACAGTCGCG GTGGCTGGCT GGGTTTTGTC 750
GCCATGATTT TTGTCTGGGC GTTATTAGGG CTCTACTGGT TTCAACCCCG 800
TCTACCCGCA CCCTGGCGAC GCTGGCTATT CCCAGTCGTA TTGGGTGGAC 850
TAGTCGCGGT GCTCTTGGTG GCGGTGCTTG GACTTGAGCC GTTGCGCGTG 900
CGCGTGTTGA GCATCTTTGT GGGGCGTGAA GACAGCAGCA ACAACTTCCG 950
GATCAATGTC TGGCTGGCGG TGCTGCAGAT GATTCAAGAT CGGCCTTGGC 1000
TGGGCATCGG CCCCGGCAAT ACCGCCTTTA ACCTGGTTTA TCCCCTCTAT 1050
CAACAGGCGC GCTTTACGGC GTTGAGCGCC TACTCCGTCC CGCTGGAAGT 1100
CGCGGTTGAG GGCGGACTAC TGGGCTTGAC GGCCTTCGCT TGGCTGCTGC 1150
TGGTCACGGC GGTGACGGCG GTGCGGCAGG TGAGCCGACT GCGGCGCGAT 1200
CGCAATCCCC AAGCCTTTTG GTTGATGGCT AGCTTGGCCG GTTTGGCAGG 1250
AATGCTGGGT CACGGTCTGT TTGATACCGT GCTCTATCGA CCGGAAGCCA 1300
GTACGCTCTG GTGGCTCTGT ATTGGAGCGA TCGCGAGTTT CTGGCAGCCC 1350
CAACCTTCCA AGCAACTCCC TCCAGAAGCC GAGCATTCAG ACGAAAAAAT 1400
GTAG 1404



(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 467

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

Met Thr Val Trp Gln Thr Leu Thr Phe Ala His Tyr Gln Pro Gln
5 10 15

Gln Trp Gly His Ser Ser Phe Leu His Arg Leu Phe Gly Ser Leu
20 25 30

Arg Ala Trp Arg Ala Ser Ser Gln Leu Leu Val Trp Ser Glu Ala
35 40 45

Leu Gly Gly Phe Leu Leu Ala Val Val Tyr Gly Ser Ala Pro Phe
50 55 60

Val Pro Ser Ser Ala Leu Gly Leu Gly Leu Ala Ala Ile Ala Ala
65 70 75

Tyr Trp Ala Leu Leu Ser Leu Thr Asp Ile Asp Leu Arg Gln Ala
80 85 90

Thr Pro Ile His Trp Leu Val Leu Leu Tyr Trp Gly Val Asp Ala
95 100 105

Leu Ala Thr Gly Leu Ser Pro Val Arg Ala Ala Ala Leu Val Gly
110 115 120

Leu Ala Lys Leu Thr Leu Tyr Leu Leu Val Phe Ala Leu Ala Ala
125 130 135

Arg Val Leu Arg Asn Pro Arg Leu Arg Ser Leu Leu Phe Ser Val
140 145 150

Val Val Ile Thr Ser Leu Phe Val Ser Val Tyr Gly Leu Asn Gln
155 160 165

Trp Ile Tyr Gly Val Glu Glu Leu Ala Thr Trp Val Asp Arg Asn
170 175 180

Ser Val Ala Asp Phe Thr Ser Arg Val Tyr Ser Tyr Leu Gly Asn
185 190 195

Pro Asn Leu Leu Ala Ala Tyr Leu Val Pro Thr Thr Ala Phe Ser
200 205 210

Ala Ala Ala Ile Gly Val Trp Arg Gly Trp Leu Pro Lys Leu Leu
215 220 225

Ala Ile Ala Ala Thr Gly Ala Ser Ser Leu Cys Leu Ile Leu Thr
230 235 240

Tyr Ser Arg Gly Gly Trp Leu Gly Phe Val Ala Met Ile Phe Val
245 250 255

Trp Ala Leu Leu Gly Leu Tyr Trp Phe Gln Pro Arg Leu Pro Ala
260 265 270

Pro Trp Arg Arg Trp Leu Phe Pro Val Val Leu Gly Gly Leu Val
275 280 285

Ala Val Leu Leu Val Ala Val Leu Gly Leu Glu Pro Leu Arg Val
290 295 300

Arg Val Leu Ser Ile Phe Val Gly Arg Glu Asp Ser Ser Asn Asn
305 310 315

Phe Arg Ile Asn Val Trp Leu Ala Val Leu Gln Met Ile Gln Asp
320 325 330

Arg Pro Trp Leu Gly Ile Gly Pro Gly Asn Thr Ala Phe Asn Leu
335 340 345

Val Tyr Pro Leu Tyr Gln Gln Ala Arg Phe Thr Ala Leu Ser Ala
350 355 360

Tyr Ser Val Pro Leu Glu Val Ala Val Glu Gly Gly Leu Leu Gly
365 370 375

Leu Thr Ala Phe Ala Trp Leu Leu Leu Val Thr Ala Val Thr Ala
380 385 390

Val Arg Gln Val Ser Arg Leu Arg Arg Asp Arg Asn Pro Gln Ala
395 400 405

Phe Trp Leu Met Ala Ser Leu Ala Gly Leu Ala Gly Met Leu Gly
410 415 420

His Gly Leu Phe Asp Thr Val Leu Tyr Arg Pro Glu Ala Ser Thr
425 430 435

Leu Trp Trp Leu Cys Ile Gly Ala Ile Ala Ser Phe Trp Gln Pro
440 445 450

Gln Pro Ser Lys Gln Leu Pro Pro Glu Ala Glu His Ser Asp Glu
455 460 465

Lys Met

(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1425

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

ATGGTGTCTC CCATCTCTAT CTGGCGATCG CTGATGTTTG GCGGTTTTTC 50
CCCCCAGGAA TGGGGCCGGG GCAGTGTGCT CCATCGTTTG GTGGGCTGGG 100
GACAGAGTTG GATACAGGCT AGTGTGCTCT GGCCCCACTT CGAGGCATTG 150
GGTACGGCTC TAGTGGCAAT AATTTTTATT GCGGCTCCCT TCACCTCCAC 200
CACCATGTTG GGCATTTTTA TGCTGCTCTG TGGAGCCTTT TGGGCTCTGC 250
TGACCTTTGC TGATCAACCA GGGAAGGGTT TGACTCCCAT CCATGTTTTA 300
GTTTTTGCCT ACTGGTGCAT TTCGGCGATC GCCGTGGGAT TTTCTCCGGT 350
AAAAATGGCG GCGGCGTCGG GGTTAGCGAA ATTAACAGCT AATTTATGTC 400
TGTTTCTACT GGCGGCGAGG TTATTGCAAA ACAAACAATG GTTGAACCGG 450
TTAGTAACCG TTGTTTTACT GGTAGGGCTA TTGGTGGGGA GTTACGGTCT 500
GCGACAACAG GTGGACGGGG TAGAACAGTT AGCCACTTGG AATGACCCCA 550
CCTCTACCTT GGCCCAGGCC ACTAGGGTAT ATAGCTTTTT AGGTAATCCC 600
AATCTCTTGG CGGCTTACCT GGTGCCCATG ACGGGTTTGA GCTTGAGTGC 650
CCTGGTGGTA TGGCGACGGT GGTGGCCCAA ACTGCTGGGA GCAACCATGG 700
TGATTGTTAA CCTACTCTGT CTCTTTTTTA CCCAGAGCCG GGGCGGTTGG 750
CTAGCAGTGC TGGCCCTGGG AGCTACCTTC CTGGCCCTTT GTTACTTCTG 800
GTGGTTACCC CAATTACCCA AATTTTGGCA ACGGTGGTCT TTGCCCCTGG 850
CGATCGCCGT GGCGGTTATA TTAGGTGGGG GAGCGTTGAT TGCGGTGGAA 900
CCGATTCGAC TCAGGGCCAT GAGCATTTTT GCTGGGCGGG AAGACAGCAG 950
TAATAATTTC CGCATCAATG TTTGGGAAGG GGTAAAAGCC ATGATCCGAG 1000
CCCGCCCTAT CATTGGCATT GGCCCAGGTA ACGAAGCCTT TAACCAAATT 1050
TATCCTTACT ATATGCGGCC CCGCTTCACC GCCCTGAGTG CCTATTCCAT 1100
TTACCTAGAA ATTTTGGTGG AAACGGGTGT AGTTGGTTTT ACCTGTATGC 1150
TCTGGCTGTT GGCCGTTACC CTAGGCAAAG GCGTAGAACT GGTTAAACGC 1200
TGTCGCCAAA CCCTCGCCCC GGAAGGCATC TGGATTATGG GGGCTTTAGC 1250
GGCGATCATC GGTTTGTTGG TCCACGGCAT GGTAGATACA GTCTGGTACC 1300
GTCCCCCGGT GAGCACTTTG TGGTGGTTGC TAGTGGCCAT TGTTGCTAGT 1350
CAGTGGGCCA GCGCCCAGGC CCGTTTGGAG GCCAGTAAAG AAGAAAATGA 1400
GGACAAACCT CTTCTTGCTT CATAA 1425

(2) INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 474

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:

Met Val Ser Pro Ile Ser Ile Trp Arg Ser Leu Met Phe Gly Gly
5 10 15

Phe Ser Pro Gln Glu Trp Gly Arg Gly Ser Val Leu His Arg Leu
20 25 30

Val Gly Trp Gly Gln Ser Trp Ile Gln Ala Ser Val Leu Trp Pro
35 40 45

His Phe Glu Ala Leu Gly Thr Ala Leu Val Ala Ile Ile Phe Ile
50 55 60

Ala Ala Pro Phe Thr Ser Thr Thr Met Leu Gly Ile Phe Met Leu
65 70 75

Leu Cys Gly Ala Phe Trp Ala Leu Leu Thr Phe Ala Asp Gln Pro
80 85 90

Gly Lys Gly Leu Thr Pro Ile His Val Leu Val Phe Ala Tyr Trp
95 100 105

Cys Ile Ser Ala Ile Ala Val Gly Phe Ser Pro Val Lys Met Ala
110 115 120

Ala Ala Ser Gly Leu Ala Lys Leu Thr Ala Asn Leu Cys Leu Phe
125 130 135

Leu Leu Ala Ala Arg Leu Leu Gln Asn Lys Gln Trp Leu Asn Arg
140 145 150

Leu Val Thr Val Val Leu Leu Val Gly Leu Leu Val Gly Ser Tyr
155 160 165

Gly Leu Arg Gln Gln Val Asp Gly Val Glu Gln Leu Ala Thr Trp
170 175 180

Asn Asp Pro Thr Ser Thr Leu Ala Gln Ala Thr Arg Val Tyr Ser
185 190 195

Phe Leu Gly Asn Pro Asn Leu Leu Ala Ala Tyr Leu Val Pro Met
200 205 210

Thr Gly Leu Ser Leu Ser Ala Leu Val Val Trp Arg Arg Trp Trp
215 220 225

Pro Lys Leu Leu Gly Ala Thr Met Val Ile Val Asn Leu Leu Cys
230 235 240

Leu Phe Phe Thr Gln Ser Arg Gly Gly Trp Leu Ala Val Leu Ala
245 250 255

Leu Gly Ala Thr Phe Leu Ala Leu Cys Tyr Phe Trp Trp Leu Pro
260 265 270

Gln Leu Pro Lys Phe Trp Gln Arg Trp Ser Leu Pro Leu Ala Ile
275 280 285

Ala Val Ala Val Ile Leu Gly Gly Gly Ala Leu Ile Ala Val Glu
290 295 300

Pro Ile Arg Leu Arg Ala Met Ser Ile Phe Ala Gly Arg Glu Asp
305 310 315

Ser Ser Asn Asn Phe Arg Ile Asn Val Trp Glu Gly Val Lys Ala
320 325 330

Met Ile Arg Ala Arg Pro Ile Ile Gly Ile Gly Pro Gly Asn Glu
335 340 345

Ala Phe Asn Gln Ile Tyr Pro Tyr Tyr Met Arg Pro Arg Phe Thr
350 355 360

Ala Leu Ser Ala Tyr Ser Ile Tyr Leu Glu Ile Leu Val Glu Thr
365 370 375

Gly Val Val Gly Phe Thr Cys Met Leu Trp Leu Leu Ala Val Thr
380 385 390

Leu Gly Lys Gly Val Glu Leu Val Lys Arg Cys Arg Gln Thr Leu
395 400 405

Ala Pro Glu Gly Ile Trp Ile Met Gly Ala Leu Ala Ala Ile Ile
410 415 420

Gly Leu Leu Val His Gly Met Val Asp Thr Val Trp Tyr Arg Pro
425 430 435

Pro Val Ser Thr Leu Trp Trp Leu Leu Val Ala Ile Val Ala Ser
440 445 450

Gln Trp Ala Ser Ala Gln Ala Arg Leu Glu Ala Ser Lys Glu Glu
455 460 465

Asn Glu Asp Lys Pro Leu Leu Ala Ser
470



(2) INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:

GGGCTAGCCG CGATCGCGGC CTATTGGGCC C 31



(2) INFORMATION FOR SEQ ID NO:7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:

GGGCTAGGGA TCGCGCCTAT TGGGCCC 27



(2) INFORMATION FOR SEQ ID NO:8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 26

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:

GGGCTCAGAT CGCGCCTATT GGGCCC 26



(2) INFORMATION FOR SEQ ID NO:9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:

Gly Leu Ala Ala Ile Ala Ala Tyr Trp Ala Leu
5 10



